Overview

Dataset statistics

Statistic                        Dataset A    Dataset B
Number of variables              12           12
Number of observations           446          446
Missing cells                    428          435
Missing cells (%)                8.0%         8.1%
Duplicate rows                   0            0
Duplicate rows (%)               0.0%         0.0%
Total size in memory             45.3 KiB     45.3 KiB
Average record size in memory    104.0 B      104.0 B
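The missing-cell percentages can be cross-checked from the counts in this table. The sketch below assumes the percentage is the number of missing cells divided by variables times observations; the function name is illustrative and not part of the profiling tool.

```python
def missing_cells_pct(missing_cells: int, n_variables: int, n_observations: int) -> float:
    """Missing cells as a percentage of all cells (variables x observations)."""
    total_cells = n_variables * n_observations
    return 100.0 * missing_cells / total_cells


# Counts copied from the dataset statistics table.
pct_a = missing_cells_pct(428, 12, 446)
pct_b = missing_cells_pct(435, 12, 446)

print(f"Dataset A: {pct_a:.1f}%")
print(f"Dataset B: {pct_b:.1f}%")
```

Under that assumption, both results round to the percentages reported in the table.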

Variable types

Type           Dataset A    Dataset B
Numeric        5            5
Categorical    4            4
Text           3            3

Alerts

Alert type          Dataset A                                         Dataset B
High correlation    Fare is highly overall correlated with Pclass     Alert not present in this dataset
High correlation    Pclass is highly overall correlated with Fare     Alert not present in this dataset
High correlation    Sex is highly overall correlated with Survived    Sex is highly overall correlated with Survived
High correlation    Survived is highly overall correlated with Sex    Survived is highly overall correlated with Sex
Missing             Age has 85 (19.1%) missing values                 Age has 92 (20.6%) missing values
Missing             Cabin has 342 (76.7%) missing values              Cabin has 343 (76.9%) missing values
Unique              PassengerId has unique values                     PassengerId has unique values
Unique              Name has unique values                            Name has unique values
Zeros               SibSp has 307 (68.8%) zeros                       SibSp has 311 (69.7%) zeros
Zeros               Parch has 334 (74.9%) zeros                       Parch has 353 (79.1%) zeros
Zeros               Fare has 6 (1.3%) zeros                           Fare has 6 (1.3%) zeros
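The percentages quoted in the Missing and Zeros alerts can be recomputed from the counts and the 446 observations per dataset. This is a minimal sketch of that arithmetic; the dictionary layout and helper name are illustrative assumptions, not the tool's internals.

```python
N_OBSERVATIONS = 446  # rows in each dataset, from the dataset statistics

# (variable, kind) -> (count in Dataset A, count in Dataset B), copied from the alerts.
alert_counts = {
    ("Age", "missing"): (85, 92),
    ("Cabin", "missing"): (342, 343),
    ("SibSp", "zeros"): (307, 311),
    ("Parch", "zeros"): (334, 353),
    ("Fare", "zeros"): (6, 6),
}


def share(count: int, total: int = N_OBSERVATIONS) -> str:
    """Format a count as a percentage of all observations, one decimal place."""
    return f"{100.0 * count / total:.1f}%"


alert_shares = {key: (share(a), share(b)) for key, (a, b) in alert_counts.items()}

for (variable, kind), (pct_a, pct_b) in alert_shares.items():
    print(f"{variable} {kind}: Dataset A {pct_a}, Dataset B {pct_b}")
```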

Reproduction

Property                  Dataset A                     Dataset B
Analysis started          2025-11-21 05:12:36.223156    2025-11-21 05:12:38.091184
Analysis finished         2025-11-21 05:12:38.088604    2025-11-21 05:12:39.960627
Duration                  1.87 seconds                  1.87 seconds
Software version          ydata-profiling v0.0.dev0     ydata-profiling v0.0.dev0
Download configuration    config.json                   config.json
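The reported durations follow from the start and finish timestamps. A small sketch using only the standard library; the format string and function name are assumptions made for this example.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"  # matches the timestamps shown in the report


def duration_seconds(started: str, finished: str) -> float:
    """Elapsed time between two report timestamps, in seconds."""
    start = datetime.strptime(started, TIMESTAMP_FORMAT)
    end = datetime.strptime(finished, TIMESTAMP_FORMAT)
    return (end - start).total_seconds()


dur_a = duration_seconds("2025-11-21 05:12:36.223156", "2025-11-21 05:12:38.088604")
dur_b = duration_seconds("2025-11-21 05:12:38.091184", "2025-11-21 05:12:39.960627")

print(f"Dataset A: {dur_a:.2f} s")
print(f"Dataset B: {dur_b:.2f} s")
```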

Variables

PassengerId
Real number (ℝ)

Statistic       Dataset A    Dataset B
Distinct        446          446
Distinct (%)    100.0%       100.0%
Missing         0            0
Missing (%)     0.0%         0.0%
Infinite        0            0
Infinite (%)    0.0%         0.0%
Mean            441.1861     429.84978
Minimum         1            1
Maximum         891          889
Zeros           0            0
Zeros (%)       0.0%         0.0%
Negative        0            0
Negative (%)    0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Quantile statistics

Statistic                    Dataset A    Dataset B
Minimum                      1            1
5-th percentile              30.5         47.25
Q1                           222.25       211.25
Median                       441.5        405.5
Q3                           662.75       646.75
95-th percentile             844.75       832.5
Maximum                      891          889
Range                        890          888
Interquartile range (IQR)    440.5        435.5
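Range and interquartile range are derived from the other quantile rows: range is maximum minus minimum, and IQR is Q3 minus Q1. The sketch below checks this against the PassengerId figures; the dictionary keys are chosen for this example only.

```python
# Quantile values for PassengerId, copied from the table above.
quantiles = {
    "Dataset A": {"min": 1, "q1": 222.25, "q3": 662.75, "max": 891},
    "Dataset B": {"min": 1, "q1": 211.25, "q3": 646.75, "max": 889},
}


def spread(q: dict) -> dict:
    """Range (max - min) and interquartile range (Q3 - Q1)."""
    return {"range": q["max"] - q["min"], "iqr": q["q3"] - q["q1"]}


spreads = {name: spread(q) for name, q in quantiles.items()}

for name, s in spreads.items():
    print(f"{name}: range={s['range']}, IQR={s['iqr']}")
```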

Descriptive statistics

Statistic                          Dataset A       Dataset B
Standard deviation                 261.69006       253.2678
Coefficient of variation (CV)      0.5931512       0.58920072
Kurtosis                           -1.2158191      -1.1896159
Mean                               441.1861        429.84978
Median Absolute Deviation (MAD)    221             215.5
Skewness                           -0.010241622    0.057135374
Sum                                196769          191713
Variance                           68481.689       64144.577
Monotonicity                       Not monotonic   Not monotonic
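Two of the descriptive statistics are functions of others in the same table: the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. A short check on the PassengerId values follows; small differences are expected because the inputs are already rounded.

```python
# Standard deviation and mean for PassengerId, copied from the table above.
stats = {
    "Dataset A": {"std": 261.69006, "mean": 441.1861},
    "Dataset B": {"std": 253.2678, "mean": 429.84978},
}

derived = {
    name: {
        "cv": s["std"] / s["mean"],   # coefficient of variation
        "variance": s["std"] ** 2,    # variance as the squared standard deviation
    }
    for name, s in stats.items()
}

for name, d in derived.items():
    print(f"{name}: CV={d['cv']:.7f}, variance={d['variance']:.3f}")
```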
[Figure: histogram with fixed-size bins (bins=50)]
Dataset A
Value                 Count    Frequency (%)
232                   1        0.2%
379                   1        0.2%
120                   1        0.2%
885                   1        0.2%
485                   1        0.2%
293                   1        0.2%
74                    1        0.2%
725                   1        0.2%
338                   1        0.2%
334                   1        0.2%
Other values (436)    436      97.8%

Dataset B
Value                 Count    Frequency (%)
654                   1        0.2%
527                   1        0.2%
570                   1        0.2%
72                    1        0.2%
377                   1        0.2%
294                   1        0.2%
121                   1        0.2%
73                    1        0.2%
137                   1        0.2%
553                   1        0.2%
Other values (436)    436      97.8%
Dataset A
Value    Count    Frequency (%)
1        1        0.2%
2        1        0.2%
3        1        0.2%
4        1        0.2%
5        1        0.2%
7        1        0.2%
8        1        0.2%
9        1        0.2%
10       1        0.2%
11       1        0.2%

Dataset B
Value    Count    Frequency (%)
1        1        0.2%
2        1        0.2%
3        1        0.2%
5        1        0.2%
7        1        0.2%
8        1        0.2%
10       1        0.2%
11       1        0.2%
12       1        0.2%
13       1        0.2%

Survived
Categorical

Statistic       Dataset A    Dataset B
Distinct        2            2
Distinct (%)    0.4%         0.4%
Missing         0            0
Missing (%)     0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Value    Count (Dataset A)    Count (Dataset B)
0        274                  290
1        172                  156

Length

Statistic        Dataset A    Dataset B
Max length       1            1
Median length    1            1
Mean length      1            1
Min length       1            1

Characters and Unicode

Statistic              Dataset A    Dataset B
Total characters       446          446
Distinct characters    2            2
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic     Dataset A    Dataset B
Unique        0            0
Unique (%)    0.0%         0.0%

Sample

Row        Dataset A    Dataset B
1st row    0            1
2nd row    0            1
3rd row    0            1
4th row    0            0
5th row    1            1

Common Values

Dataset A
Value    Count    Frequency (%)
0        274      61.4%
1        172      38.6%

Dataset B
Value    Count    Frequency (%)
0        290      65.0%
1        156      35.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figures: common values plots for Dataset A and Dataset B]

Dataset A
Value    Count    Frequency (%)
0        274      61.4%
1        172      38.6%

Dataset B
Value    Count    Frequency (%)
0        290      65.0%
1        156      35.0%

Most occurring characters

Dataset A
Value    Count    Frequency (%)
0        274      61.4%
1        172      38.6%

Dataset B
Value    Count    Frequency (%)
0        290      65.0%
1        156      35.0%

Most occurring categories

Category     Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    446                  100.0%                   446                  100.0%

Most frequent character per category

(unknown): every character falls in this single category, so the counts are identical to those listed under Most occurring characters.

Most occurring scripts

Script       Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    446                  100.0%                   446                  100.0%

Most frequent character per script

(unknown): every character falls in this single script, so the counts are identical to those listed under Most occurring characters.

Most occurring blocks

Block        Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    446                  100.0%                   446                  100.0%

Most frequent character per block

(unknown): every character falls in this single block, so the counts are identical to those listed under Most occurring characters.

Pclass
Categorical

Statistic       Dataset A    Dataset B
Distinct        3            3
Distinct (%)    0.7%         0.7%
Missing         0            0
Missing (%)     0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Value    Count (Dataset A)    Count (Dataset B)
3        248                  251
1        108                  108
2        90                   87

Length

Statistic        Dataset A    Dataset B
Max length       1            1
Median length    1            1
Mean length      1            1
Min length       1            1

Characters and Unicode

Statistic              Dataset A    Dataset B
Total characters       446          446
Distinct characters    3            3
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1

Unique

Statistic     Dataset A    Dataset B
Unique        0            0
Unique (%)    0.0%         0.0%

Sample

Row        Dataset A    Dataset B
1st row    3            3
2nd row    3            2
3rd row    3            3
4th row    3            3
5th row    1            3

Common Values

Dataset A
Value    Count    Frequency (%)
3        248      55.6%
1        108      24.2%
2        90       20.2%

Dataset B
Value    Count    Frequency (%)
3        251      56.3%
1        108      24.2%
2        87       19.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figures: common values plots for Dataset A and Dataset B]

Dataset A
Value    Count    Frequency (%)
3        248      55.6%
1        108      24.2%
2        90       20.2%

Dataset B
Value    Count    Frequency (%)
3        251      56.3%
1        108      24.2%
2        87       19.5%

Most occurring characters

Dataset A
Value    Count    Frequency (%)
3        248      55.6%
1        108      24.2%
2        90       20.2%

Dataset B
Value    Count    Frequency (%)
3        251      56.3%
1        108      24.2%
2        87       19.5%

Most occurring categories

Category     Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    446                  100.0%                   446                  100.0%

Most frequent character per category

(unknown): every character falls in this single category, so the counts are identical to those listed under Most occurring characters.

Most occurring scripts

Script       Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    446                  100.0%                   446                  100.0%

Most frequent character per script

(unknown): every character falls in this single script, so the counts are identical to those listed under Most occurring characters.

Most occurring blocks

Block        Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    446                  100.0%                   446                  100.0%

Most frequent character per block

(unknown): every character falls in this single block, so the counts are identical to those listed under Most occurring characters.

Name
Text

Statistic       Dataset A    Dataset B
Distinct        446          446
Distinct (%)    100.0%       100.0%
Missing         0            0
Missing (%)     0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Length

Statistic        Dataset A    Dataset B
Max length       82           67
Median length    51           49
Mean length      26.827354    26.630045
Min length       12           13

Characters and Unicode

Statistic              Dataset A    Dataset B
Total characters       11965        11877
Distinct characters    60           59
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1

Unique

Statistic     Dataset A    Dataset B
Unique        446          446
Unique (%)    100.0%       100.0%

Sample

Row        Dataset A                            Dataset B
1st row    Larsson, Mr. Bengt Edvin             O'Leary, Miss. Hanora "Norah"
2nd row    Betros, Mr. Tannous                  Ridsdale, Miss. Lucy
3rd row    Andersson, Miss. Ellis Anna Maria    Jonsson, Mr. Carl
4th row    Sutehall, Mr. Henry Jr               Goodwin, Miss. Lillian Amy
5th row    Bishop, Mr. Dickinson H              Landergren, Miss. Aurora Adelia
Dataset A
Value                 Count    Frequency (%)
mr                    259      14.2%
miss                  91       5.0%
mrs                   68       3.7%
john                  25       1.4%
william               24       1.3%
master                22       1.2%
henry                 20       1.1%
george                12       0.7%
edward                11       0.6%
james                 11       0.6%
Other values (866)    1277     70.2%

Dataset B
Value                 Count    Frequency (%)
mr                    270      14.9%
miss                  89       4.9%
mrs                   65       3.6%
william               30       1.7%
henry                 23       1.3%
john                  23       1.3%
master                15       0.8%
thomas                13       0.7%
james                 12       0.7%
charles               11       0.6%
Other values (887)    1257     69.5%

Most occurring characters

Dataset A
Value                Count    Frequency (%)
(space)              1376     11.5%
r                    981      8.2%
e                    866      7.2%
a                    835      7.0%
s                    657      5.5%
n                    655      5.5%
i                    617      5.2%
M                    555      4.6%
l                    513      4.3%
o                    495      4.1%
Other values (50)    4415     36.9%

Dataset B
Value                Count    Frequency (%)
(space)              1363     11.5%
r                    967      8.1%
e                    847      7.1%
a                    835      7.0%
i                    652      5.5%
s                    647      5.4%
n                    645      5.4%
M                    567      4.8%
l                    547      4.6%
o                    474      4.0%
Other values (49)    4333     36.5%

Most occurring categories

Category     Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    11965                100.0%                   11877                100.0%

Most frequent character per category

(unknown): every character falls in this single category, so the counts are identical to those listed under Most occurring characters.

Most occurring scripts

Script       Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    11965                100.0%                   11877                100.0%

Most frequent character per script

(unknown): every character falls in this single script, so the counts are identical to those listed under Most occurring characters.

Most occurring blocks

Block        Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    11965                100.0%                   11877                100.0%

Most frequent character per block

(unknown): every character falls in this single block, so the counts are identical to those listed under Most occurring characters.

Sex
Categorical

Statistic       Dataset A    Dataset B
Distinct        2            2
Distinct (%)    0.4%         0.4%
Missing         0            0
Missing (%)     0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Value     Count (Dataset A)    Count (Dataset B)
male      286                  293
female    160                  153

Length

Statistic        Dataset A    Dataset B
Max length       6            6
Median length    4            4
Mean length      4.7174888    4.6860987
Min length       4            4

Characters and Unicode

Statistic              Dataset A    Dataset B
Total characters       2104         2090
Distinct characters    5            5
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1

Unique

Statistic     Dataset A    Dataset B
Unique        0            0
Unique (%)    0.0%         0.0%

Sample

Row        Dataset A    Dataset B
1st row    male         female
2nd row    male         female
3rd row    female       male
4th row    male         female
5th row    male         female

Common Values

Dataset A
Value     Count    Frequency (%)
male      286      64.1%
female    160      35.9%

Dataset B
Value     Count    Frequency (%)
male      293      65.7%
female    153      34.3%
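The total character count and mean length reported for Sex follow from these value counts, since "male" has 4 characters and "female" has 6. The sketch below reproduces both figures; the helper name is an assumption for illustration.

```python
# Value counts for Sex, copied from the Common Values tables.
counts_a = {"male": 286, "female": 160}
counts_b = {"male": 293, "female": 153}


def length_stats(counts: dict) -> tuple:
    """Total characters and mean string length implied by a value-count table."""
    total_chars = sum(len(value) * n for value, n in counts.items())
    n_rows = sum(counts.values())
    return total_chars, total_chars / n_rows


total_a, mean_a = length_stats(counts_a)
total_b, mean_b = length_stats(counts_b)

print(f"Dataset A: total characters={total_a}, mean length={mean_a:.7f}")
print(f"Dataset B: total characters={total_b}, mean length={mean_b:.7f}")
```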

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figures: common values plots for Dataset A and Dataset B]

Dataset A
Value     Count    Frequency (%)
male      286      64.1%
female    160      35.9%

Dataset B
Value     Count    Frequency (%)
male      293      65.7%
female    153      34.3%

Most occurring characters

Dataset A
Value    Count    Frequency (%)
e        606      28.8%
m        446      21.2%
a        446      21.2%
l        446      21.2%
f        160      7.6%

Dataset B
Value    Count    Frequency (%)
e        599      28.7%
m        446      21.3%
a        446      21.3%
l        446      21.3%
f        153      7.3%

Most occurring categories

Category     Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    2104                 100.0%                   2090                 100.0%

Most frequent character per category

(unknown): every character falls in this single category, so the counts are identical to those listed under Most occurring characters.

Most occurring scripts

Script       Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    2104                 100.0%                   2090                 100.0%

Most frequent character per script

(unknown): every character falls in this single script, so the counts are identical to those listed under Most occurring characters.

Most occurring blocks

Block        Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    2104                 100.0%                   2090                 100.0%

Most frequent character per block

(unknown): every character falls in this single block, so the counts are identical to those listed under Most occurring characters.

Age
Real number (ℝ)

Statistic       Dataset A    Dataset B
Distinct        78           72
Distinct (%)    21.6%        20.3%
Missing         85           92
Missing (%)     19.1%        20.6%
Infinite        0            0
Infinite (%)    0.0%         0.0%
Mean            29.090277    30.15678
Minimum         0.42         1
Maximum         74           80
Zeros           0            0
Zeros (%)       0.0%         0.0%
Negative        0            0
Negative (%)    0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB
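For Age, the Distinct (%) figures are consistent with the distinct count being divided by the number of non-missing observations rather than by all 446 rows. This is an inference from the numbers in this report, not a statement about the tool's internals; the sketch below shows the arithmetic under that assumption.

```python
N_OBSERVATIONS = 446  # rows in each dataset


def distinct_pct(distinct: int, missing: int, n_obs: int = N_OBSERVATIONS) -> float:
    """Distinct values as a percentage of non-missing observations (assumed denominator)."""
    non_missing = n_obs - missing
    return 100.0 * distinct / non_missing


# Distinct and missing counts for Age, copied from the table above.
age_a = distinct_pct(distinct=78, missing=85)
age_b = distinct_pct(distinct=72, missing=92)

print(f"Dataset A: {age_a:.1f}%")
print(f"Dataset B: {age_b:.1f}%")
```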

Quantile statistics

Statistic                    Dataset A    Dataset B
Minimum                      0.42         1
5-th percentile              3            6
Q1                           20           21
Median                       28           29
Q3                           38           38
95-th percentile             54           54
Maximum                      74           80
Range                        73.58        79
Interquartile range (IQR)    18           17

Descriptive statistics

Statistic                          Dataset A        Dataset B
Standard deviation                 14.300064        13.737735
Coefficient of variation (CV)      0.49157539       0.45554382
Kurtosis                           0.093756131      0.3903608
Mean                               29.090277        30.15678
Median Absolute Deviation (MAD)    8.5              8
Skewness                           0.25630784       0.43052165
Sum                                10501.59         10675.5
Variance                           204.49184        188.72535
Monotonicity                       Not monotonic    Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]
Dataset A
Value                Count    Frequency (%)
24                   17       3.8%
30                   16       3.6%
21                   13       2.9%
22                   13       2.9%
25                   13       2.9%
36                   12       2.7%
18                   12       2.7%
28                   12       2.7%
35                   12       2.7%
19                   12       2.7%
Other values (68)    229      51.3%
(Missing)            85       19.1%

Dataset B
Value                Count    Frequency (%)
19                   17       3.8%
22                   15       3.4%
21                   15       3.4%
32                   13       2.9%
18                   13       2.9%
24                   11       2.5%
31                   11       2.5%
28                   11       2.5%
26                   11       2.5%
29                   11       2.5%
Other values (62)    226      50.7%
(Missing)            92       20.6%
Dataset A
Value    Count    Frequency (%)
0.42     1        0.2%
0.67     1        0.2%
0.75     1        0.2%
0.83     1        0.2%
0.92     1        0.2%
1        5        1.1%
2        7        1.6%
3        3        0.7%
4        4        0.9%
5        3        0.7%

Dataset B
Value    Count    Frequency (%)
1        3        0.7%
2        4        0.9%
3        4        0.9%
4        3        0.7%
5        3        0.7%
6        2        0.4%
7        1        0.2%
8        1        0.2%
9        3        0.7%
11       1        0.2%

SibSp
Real number (ℝ)

Statistic       Dataset A    Dataset B
Distinct        7            7
Distinct (%)    1.6%         1.6%
Missing         0            0
Missing (%)     0.0%         0.0%
Infinite        0            0
Infinite (%)    0.0%         0.0%
Mean            0.5044843    0.49103139
Minimum         0            0
Maximum         8            8
Zeros           307          311
Zeros (%)       68.8%        69.7%
Negative        0            0
Negative (%)    0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Quantile statistics

Statistic                    Dataset A    Dataset B
Minimum                      0            0
5-th percentile              0            0
Q1                           0            0
Median                       0            0
Q3                           1            1
95-th percentile             2            2
Maximum                      8            8
Range                        8            8
Interquartile range (IQR)    1            1

Descriptive statistics

Statistic                          Dataset A        Dataset B
Standard deviation                 1.0969734        1.0509109
Coefficient of variation (CV)      2.174445         2.1402113
Kurtosis                           20.520514        19.237256
Mean                               0.5044843        0.49103139
Median Absolute Deviation (MAD)    0                0
Skewness                           3.9694846        3.769624
Sum                                225              219
Variance                           1.2033506        1.1044138
Monotonicity                       Not monotonic    Not monotonic
[Figure: histogram with fixed-size bins (bins=7)]
Dataset A
Value    Count    Frequency (%)
0        307      68.8%
1        105      23.5%
2        14       3.1%
4        8        1.8%
3        6        1.3%
8        4        0.9%
5        2        0.4%

Dataset B
Value    Count    Frequency (%)
0        311      69.7%
1        99       22.2%
2        14       3.1%
3        10       2.2%
4        7        1.6%
8        3        0.7%
5        2        0.4%
Dataset A
Value    Count    Frequency (%)
0        307      68.8%
1        105      23.5%
2        14       3.1%
3        6        1.3%
4        8        1.8%
5        2        0.4%
8        4        0.9%

Dataset B
Value    Count    Frequency (%)
0        311      69.7%
1        99       22.2%
2        14       3.1%
3        10       2.2%
4        7        1.6%
5        2        0.4%
8        3        0.7%

Parch
Real number (ℝ)

Statistic       Dataset A     Dataset B
Distinct        7             7
Distinct (%)    1.6%          1.6%
Missing         0             0
Missing (%)     0.0%          0.0%
Infinite        0             0
Infinite (%)    0.0%          0.0%
Mean            0.39237668    0.35650224
Minimum         0             0
Maximum         6             6
Zeros           334           353
Zeros (%)       74.9%         79.1%
Negative        0             0
Negative (%)    0.0%          0.0%
Memory size     7.0 KiB       7.0 KiB

Quantile statistics

Statistic                    Dataset A    Dataset B
Minimum                      0            0
5-th percentile              0            0
Q1                           0            0
Median                       0            0
Q3                           0.75         0
95-th percentile             2            2
Maximum                      6            6
Range                        6            6
Interquartile range (IQR)    0.75         0

Descriptive statistics

Statistic                          Dataset A        Dataset B
Standard deviation                 0.83509278       0.85369803
Coefficient of variation (CV)      2.1282936        2.3946498
Kurtosis                           12.456563        13.053339
Mean                               0.39237668       0.35650224
Median Absolute Deviation (MAD)    0                0
Skewness                           3.0649883        3.2723044
Sum                                175              159
Variance                           0.69737996       0.72880032
Monotonicity                       Not monotonic    Not monotonic
[Figure: histogram with fixed-size bins (bins=7)]
Dataset A
Value    Count    Frequency (%)
0        334      74.9%
1        69       15.5%
2        35       7.8%
5        4        0.9%
3        2        0.4%
6        1        0.2%
4        1        0.2%

Dataset B
Value    Count    Frequency (%)
0        353      79.1%
1        51       11.4%
2        32       7.2%
5        4        0.9%
4        3        0.7%
3        2        0.4%
6        1        0.2%
Dataset A
Value    Count    Frequency (%)
0        334      74.9%
1        69       15.5%
2        35       7.8%
3        2        0.4%
4        1        0.2%
5        4        0.9%
6        1        0.2%

Dataset B
Value    Count    Frequency (%)
0        353      79.1%
1        51       11.4%
2        32       7.2%
3        2        0.4%
4        3        0.7%
5        4        0.9%
6        1        0.2%

Ticket
Text

Statistic       Dataset A    Dataset B
Distinct        375          385
Distinct (%)    84.1%        86.3%
Missing         0            0
Missing (%)     0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Length

Statistic        Dataset A    Dataset B
Max length       18           18
Median length    17           17
Mean length      6.6457399    6.8766816
Min length       3            3

Characters and Unicode

Statistic              Dataset A    Dataset B
Total characters       2964         3067
Distinct characters    32           35
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1

Unique

Statistic     Dataset A    Dataset B
Unique        322          340
Unique (%)    72.2%        76.2%

Sample

Row        Dataset A           Dataset B
1st row    347067              330919
2nd row    2648                W./C. 14258
3rd row    347082              350417
4th row    SOTON/OQ 392076     CA 2144
5th row    11967               C 7077
Dataset A
Value                 Count    Frequency (%)
pc                    32       5.7%
c.a                   12       2.1%
ca                    8        1.4%
2                     7        1.2%
ston/o                7        1.2%
a/5                   6        1.1%
sc/paris              5        0.9%
1601                  5        0.9%
soton/oq              4        0.7%
a/4                   4        0.7%
Other values (394)    471      84.0%

Dataset B
Value                 Count    Frequency (%)
pc                    33       5.7%
a/5                   10       1.7%
c.a                   10       1.7%
2                     9        1.6%
ston/o                9        1.6%
ca                    7        1.2%
soton/o.q             6        1.0%
w./c                  5        0.9%
347088                5        0.9%
a/4                   4        0.7%
Other values (408)    480      83.0%

Most occurring characters

Dataset A
Value                Count    Frequency (%)
3                    371      12.5%
1                    339      11.4%
2                    295      10.0%
7                    250      8.4%
4                    232      7.8%
6                    217      7.3%
0                    203      6.8%
5                    193      6.5%
9                    159      5.4%
8                    133      4.5%
Other values (22)    572      19.3%

Dataset B
Value                Count    Frequency (%)
3                    372      12.1%
1                    357      11.6%
2                    295      9.6%
7                    258      8.4%
4                    227      7.4%
0                    214      7.0%
6                    197      6.4%
5                    187      6.1%
9                    163      5.3%
8                    147      4.8%
Other values (25)    650      21.2%

Most occurring categories

Category     Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    2964                 100.0%                   3067                 100.0%

Most frequent character per category

(unknown): every character falls in this single category, so the counts are identical to those listed under Most occurring characters.

Most occurring scripts

Script       Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    2964                 100.0%                   3067                 100.0%

Most frequent character per script

(unknown): every character falls in this single script, so the counts are identical to those listed under Most occurring characters.

Most occurring blocks

Block        Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    2964                 100.0%                   3067                 100.0%

Most frequent character per block

(unknown): every character falls in this single block, so the counts are identical to those listed under Most occurring characters.

Fare
Real number (ℝ)

Statistic       Dataset A    Dataset B
Distinct        188          182
Distinct (%)    42.2%        40.8%
Missing         0            0
Missing (%)     0.0%         0.0%
Infinite        0            0
Infinite (%)    0.0%         0.0%
Mean            30.000204    32.643834
Minimum         0            0
Maximum         263          512.3292
Zeros           6            6
Zeros (%)       1.3%         1.3%
Negative        0            0
Negative (%)    0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Quantile statistics

Statistic                    Dataset A    Dataset B
Minimum                      0            0
5-th percentile              7.225        7.225
Q1                           7.8958       7.8958
Median                       14.75        14.4542
Q3                           32.875       30.5
95-th percentile             103.19375    130.2375
Maximum                      263          512.3292
Range                        263          512.3292
Interquartile range (IQR)    24.9792      22.6042

Descriptive statistics

Statistic                          Dataset A        Dataset B
Standard deviation                 36.804407        52.129992
Coefficient of variation (CV)      1.2268052        1.5969323
Kurtosis                           11.655528        34.581323
Mean                               30.000204        32.643834
Median Absolute Deviation (MAD)    7.5              7.17295
Skewness                           2.9820073        4.9188093
Sum                                13380.091        14559.15
Variance                           1354.5644        2717.536
Monotonicity                       Not monotonic    Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]
Dataset A
Value                 Count    Frequency (%)
13                    20       4.5%
8.05                  20       4.5%
26                    18       4.0%
7.75                  16       3.6%
7.8958                15       3.4%
7.2292                11       2.5%
7.775                 10       2.2%
10.5                  9        2.0%
7.925                 8        1.8%
7.225                 8        1.8%
Other values (178)    311      69.7%

Dataset B
Value                 Count    Frequency (%)
8.05                  24       5.4%
7.75                  20       4.5%
7.8958                16       3.6%
26                    16       3.6%
13                    16       3.6%
10.5                  13       2.9%
7.925                 10       2.2%
7.775                 10       2.2%
7.25                  9        2.0%
7.8542                9        2.0%
Other values (172)    303      67.9%
Dataset A
Value     Count    Frequency (%)
0         6        1.3%
4.0125    1        0.2%
5         1        0.2%
6.4375    1        0.2%
6.95      1        0.2%
6.975     1        0.2%
7.0458    1        0.2%
7.05      3        0.7%
7.0542    1        0.2%
7.125     2        0.4%

Dataset B
Value     Count    Frequency (%)
0         6        1.3%
5         1        0.2%
6.2375    1        0.2%
6.4375    1        0.2%
6.45      1        0.2%
6.4958    1        0.2%
6.75      1        0.2%
6.8583    1        0.2%
7.05      5        1.1%
7.0542    2        0.4%

Cabin
Text

Statistic       Dataset A    Dataset B
Distinct        88           89
Distinct (%)    84.6%        86.4%
Missing         342          343
Missing (%)     76.7%        76.9%
Memory size     7.0 KiB      7.0 KiB

Length

Statistic        Dataset A    Dataset B
Max length       15           11
Median length    3            3
Mean length      3.4519231    3.3300971
Min length       1            1

Characters and Unicode

Statistic              Dataset A    Dataset B
Total characters       359          343
Distinct characters    19           19
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1

Unique

Statistic     Dataset A    Dataset B
Unique        73           78
Unique (%)    70.2%        75.7%

Sample

Row        Dataset A    Dataset B
1st row    B49          D47
2nd row    D            C106
3rd row    E8           E58
4th row    E40          C2
5th row    C70          D20
Dataset A
Value                Count    Frequency (%)
g6                   3        2.6%
b49                  2        1.7%
b18                  2        1.7%
d33                  2        1.7%
b96                  2        1.7%
b98                  2        1.7%
f4                   2        1.7%
b35                  2        1.7%
f2                   2        1.7%
d26                  2        1.7%
Other values (89)    96       82.1%

Dataset B
Value                Count    Frequency (%)
f2                   3        2.6%
d                    3        2.6%
f33                  3        2.6%
g6                   2        1.8%
d26                  2        1.8%
b51                  2        1.8%
b53                  2        1.8%
b55                  2        1.8%
d20                  2        1.8%
b49                  2        1.8%
Other values (88)    91       79.8%

Most occurring characters

Dataset A
Value               Count    Frequency (%)
B                   35       9.7%
2                   35       9.7%
1                   33       9.2%
C                   33       9.2%
3                   30       8.4%
6                   26       7.2%
8                   22       6.1%
5                   20       5.6%
9                   19       5.3%
0                   18       5.0%
Other values (9)    88       24.5%

Dataset B
Value               Count    Frequency (%)
C                   36       10.5%
3                   35       10.2%
1                   30       8.7%
B                   30       8.7%
2                   29       8.5%
5                   21       6.1%
8                   19       5.5%
0                   18       5.2%
D                   17       5.0%
7                   17       5.0%
Other values (9)    91       26.5%

Most occurring categories

Category     Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    359                  100.0%                   343                  100.0%

Most frequent character per category

(unknown): every character falls in this single category, so the counts are identical to those listed under Most occurring characters.

Most occurring scripts

Script       Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    359                  100.0%                   343                  100.0%

Most frequent character per script

(unknown): every character falls in this single script, so the counts are identical to those listed under Most occurring characters.

Most occurring blocks

Block        Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    359                  100.0%                   343                  100.0%

Most frequent character per block

(unknown): every character falls in this single block, so the counts are identical to those listed under Most occurring characters.

Embarked
Categorical

Statistic       Dataset A    Dataset B
Distinct        3            3
Distinct (%)    0.7%         0.7%
Missing         1            0
Missing (%)     0.2%         0.0%
Memory size     7.0 KiB      7.0 KiB

Value    Count (Dataset A)    Count (Dataset B)
S        313                  317
C        93                   90
Q        39                   39

Length

Statistic        Dataset A    Dataset B
Max length       1            1
Median length    1            1
Mean length      1            1
Min length       1            1

Characters and Unicode

Statistic              Dataset A    Dataset B
Total characters       445          446
Distinct characters    3            3
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1

Unique

Statistic     Dataset A    Dataset B
Unique        0            0
Unique (%)    0.0%         0.0%

Sample

Row        Dataset A    Dataset B
1st row    S            Q
2nd row    C            S
3rd row    S            S
4th row    S            S
5th row    C            S

Common Values

Dataset A
Value        Count    Frequency (%)
S            313      70.2%
C            93       20.9%
Q            39       8.7%
(Missing)    1        0.2%

Dataset B
Value        Count    Frequency (%)
S            317      71.1%
C            90       20.2%
Q            39       8.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figures: common values plots for Dataset A and Dataset B]

Dataset A
Value    Count    Frequency (%)
s        313      70.3%
c        93       20.9%
q        39       8.8%

Dataset B
Value    Count    Frequency (%)
s        317      71.1%
c        90       20.2%
q        39       8.7%

Most occurring characters

Dataset A
Value    Count    Frequency (%)
S        313      70.3%
C        93       20.9%
Q        39       8.8%

Dataset B
Value    Count    Frequency (%)
S        317      71.1%
C        90       20.2%
Q        39       8.7%

Most occurring categories

Category     Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    445                  100.0%                   446                  100.0%

Most frequent character per category

(unknown): every character falls in this single category, so the counts are identical to those listed under Most occurring characters.

Most occurring scripts

Script       Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    445                  100.0%                   446                  100.0%

Most frequent character per script

(unknown): every character falls in this single script, so the counts are identical to those listed under Most occurring characters.

Most occurring blocks

Block        Count (Dataset A)    Frequency (Dataset A)    Count (Dataset B)    Frequency (Dataset B)
(unknown)    445                  100.0%                   446                  100.0%

Most frequent character per block

(unknown): every character falls in this single block, so the counts are identical to those listed under Most occurring characters.

Interactions

[Figures: 25 pairwise interaction plots each for Dataset A and Dataset B]

Correlations

[Correlation heatmap images for Dataset A and Dataset B are not included in this text version.]

Dataset A

Variable     Age     Embarked  Fare    Parch   PassengerId  Pclass  Sex     SibSp   Survived
Age          1.000   0.000     0.097   -0.233  0.097        0.244   0.035   -0.178  0.156
Embarked     0.000   1.000     0.238   0.000   0.000        0.271   0.129   0.038   0.134
Fare         0.097   0.238     1.000   0.383   -0.022       0.542   0.185   0.423   0.272
Parch        -0.233  0.000     0.383   1.000   -0.001       0.060   0.208   0.482   0.154
PassengerId  0.097   0.000     -0.022  -0.001  1.000        0.100   0.000   -0.060  0.059
Pclass       0.244   0.271     0.542   0.060   0.100        1.000   0.135   0.103   0.279
Sex          0.035   0.129     0.185   0.208   0.000        0.135   1.000   0.197   0.583
SibSp        -0.178  0.038     0.423   0.482   -0.060       0.103   0.197   1.000   0.218
Survived     0.156   0.134     0.272   0.154   0.059        0.279   0.583   0.218   1.000

Dataset B

Variable     Age     Embarked  Fare    Parch   PassengerId  Pclass  Sex     SibSp   Survived
Age          1.000   0.000     0.204   -0.174  0.030        0.285   0.000   -0.123  0.079
Embarked     0.000   1.000     0.192   0.032   0.051        0.294   0.151   0.040   0.214
Fare         0.204   0.192     1.000   0.392   -0.013       0.479   0.135   0.417   0.267
Parch        -0.174  0.032     0.392   1.000   0.032        0.000   0.185   0.400   0.000
PassengerId  0.030   0.051     -0.013  0.032   1.000        0.000   0.122   -0.028  0.084
Pclass       0.285   0.294     0.479   0.000   0.000        1.000   0.094   0.103   0.320
Sex          0.000   0.151     0.135   0.185   0.122        0.094   1.000   0.150   0.543
SibSp        -0.123  0.040     0.417   0.400   -0.028       0.103   0.150   1.000   0.092
Survived     0.079   0.214     0.267   0.000   0.084        0.320   0.543   0.092   1.000

Missing values

Dataset A, Dataset B

[Figures not included: nullity by column] A simple visualization of nullity by column.

[Figures not included: nullity matrix] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

[Figures not included: nullity correlation heatmap] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
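
The nullity correlation described above can be illustrated with a small sketch. This is not the report's own implementation: the function name `nullity_correlation`, the list-of-dicts input, and the toy rows are assumptions made for illustration. It computes the Pearson correlation between per-column "is missing" indicators and skips columns whose indicator never varies, since those have no defined correlation.

```python
from math import sqrt


def nullity_correlation(rows, columns):
    """Pearson correlation between per-column missing-value indicators.

    `rows` is a list of dicts; a value of None (or an absent key) counts
    as missing. Columns that are always present or always missing are
    skipped because their indicator has zero variance.
    """
    indicators = {
        col: [1.0 if row.get(col) is None else 0.0 for row in rows]
        for col in columns
    }
    varying = [c for c in columns if len(set(indicators[c])) > 1]

    def pearson(x, y):
        n = len(x)
        mean_x, mean_y = sum(x) / n, sum(y) / n
        cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        var_x = sum((a - mean_x) ** 2 for a in x)
        var_y = sum((b - mean_y) ** 2 for b in y)
        return cov / sqrt(var_x * var_y)

    return {
        a: {b: pearson(indicators[a], indicators[b]) for b in varying}
        for a in varying
    }


# Hypothetical toy rows, not taken from Dataset A or Dataset B.
toy_rows = [
    {"Age": 22.0, "Cabin": "C85", "Fare": 7.25},
    {"Age": None, "Cabin": None, "Fare": 71.28},
    {"Age": 35.0, "Cabin": "E46", "Fare": 8.05},
    {"Age": None, "Cabin": None, "Fare": 8.46},
    {"Age": 54.0, "Cabin": None, "Fare": 51.86},
]
result = nullity_correlation(toy_rows, ["Age", "Cabin", "Fare"])
print(result)
```

A value near 1 means two columns tend to be missing in the same rows, and a value near -1 means one tends to be present when the other is missing. With pandas, the same idea is commonly written as `df.isna().astype(float).corr()`.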

Sample

Dataset A

Index  PassengerId  Survived  Pclass  Name                               Sex     Age   SibSp  Parch  Ticket           Fare      Cabin  Embarked
231    232          0         3       Larsson, Mr. Bengt Edvin           male    29.0  0      0      347067           7.7750    NaN    S
378    379          0         3       Betros, Mr. Tannous                male    20.0  0      0      2648             4.0125    NaN    C
119    120          0         3       Andersson, Miss. Ellis Anna Maria  female  2.0   4      2      347082           31.2750   NaN    S
884    885          0         3       Sutehall, Mr. Henry Jr             male    25.0  0      0      SOTON/OQ 392076  7.0500    NaN    S
484    485          1         1       Bishop, Mr. Dickinson H            male    25.0  1      0      11967            91.0792   B49    C
292    293          0         2       Levy, Mr. Rene Jacques             male    36.0  0      0      SC/Paris 2163    12.8750   D      C
73     74           0         3       Chronopoulos, Mr. Apostolos        male    26.0  1      0      2680             14.4542   NaN    C
724    725          1         1       Chambers, Mr. Norman Campbell      male    27.0  1      0      113806           53.1000   E8     S
337    338          1         1       Burns, Miss. Elizabeth Margaret    female  41.0  0      0      16966            134.5000  E40    C
333    334          0         3       Vander Planke, Mr. Leo Edmondus    male    16.0  2      0      345764           18.0000   NaN    S

Dataset B

Index  PassengerId  Survived  Pclass  Name                             Sex     Age   SibSp  Parch  Ticket        Fare     Cabin  Embarked
653    654          1         3       O'Leary, Miss. Hanora "Norah"    female  NaN   0      0      330919        7.8292   NaN    Q
526    527          1         2       Ridsdale, Miss. Lucy             female  50.0  0      0      W./C. 14258   10.5000  NaN    S
569    570          1         3       Jonsson, Mr. Carl                male    32.0  0      0      350417        7.8542   NaN    S
71     72           0         3       Goodwin, Miss. Lillian Amy       female  16.0  5      2      CA 2144       46.9000  NaN    S
376    377          1         3       Landergren, Miss. Aurora Adelia  female  22.0  0      0      C 7077        7.2500   NaN    S
293    294          0         3       Haas, Miss. Aloisia              female  24.0  0      0      349236        8.8500   NaN    S
120    121          0         2       Hickman, Mr. Stanley George      male    21.0  2      0      S.O.C. 14879  73.5000  NaN    S
72     73           0         2       Hood, Mr. Ambrose Jr             male    21.0  0      0      S.O.C. 14879  73.5000  NaN    S
136    137          1         1       Newsom, Miss. Helen Monypeny     female  19.0  0      2      11752         26.2833  D47    S
552    553          0         3       O'Brien, Mr. Timothy             male    NaN   0      0      330979        7.8292   NaN    Q

Dataset A

Index  PassengerId  Survived  Pclass  Name                                     Sex     Age   SibSp  Parch  Ticket              Fare      Cabin  Embarked
240    241          0         3       Zabour, Miss. Thamine                    female  NaN   1      0      2665                14.4542   NaN    C
210    211          0         3       Ali, Mr. Ahmed                           male    24.0  0      0      SOTON/O.Q. 3101311  7.0500    NaN    S
698    699          0         1       Thayer, Mr. John Borland                 male    49.0  1      1      17421               110.8833  C68    C
797    798          1         3       Osman, Mrs. Mara                         female  31.0  0      0      349244              8.6833    NaN    S
33     34           0         2       Wheadon, Mr. Edward H                    male    66.0  0      0      C.A. 24579          10.5000   NaN    S
50     51           0         3       Panula, Master. Juha Niilo               male    7.0   4      1      3101295             39.6875   NaN    S
23     24           1         1       Sloper, Mr. William Thompson             male    28.0  0      0      113788              35.5000   A6     S
255    256          1         3       Touma, Mrs. Darwis (Hanne Youssef Razi)  female  29.0  0      2      2650                15.2458   NaN    C
494    495          0         3       Stanley, Mr. Edward Roland               male    21.0  0      0      A/4 45380           8.0500    NaN    S
283    284          1         3       Dorking, Mr. Edward Arthur               male    19.0  0      0      A/5. 10482          8.0500    NaN    S

Dataset B

Index  PassengerId  Survived  Pclass  Name                                                 Sex     Age   SibSp  Parch  Ticket      Fare      Cabin  Embarked
489    490          1         3       Coutts, Master. Eden Leslie "Neville"                male    9.0   1      1      C.A. 37671  15.9000   NaN    S
215    216          1         1       Newell, Miss. Madeleine                              female  31.0  1      0      35273       113.2750  D36    C
325    326          1         1       Young, Miss. Marie Grice                             female  36.0  0      0      PC 17760    135.6333  C32    C
1      2            1         1       Cumings, Mrs. John Bradley (Florence Briggs Thayer)  female  38.0  1      0      PC 17599    71.2833   C85    C
213    214          0         2       Givard, Mr. Hans Kristensen                          male    30.0  0      0      250646      13.0000   NaN    S
399    400          1         2       Trout, Mrs. William H (Jessie L)                     female  28.0  0      0      240929      12.6500   NaN    S
340    341          1         2       Navratil, Master. Edmond Roger                       male    2.0   1      1      230080      26.0000   F2     S
621    622          1         1       Kimball, Mr. Edwin Nelson Jr                         male    42.0  1      0      11753       52.5542   D19    S
98     99           1         2       Doling, Mrs. John T (Ada Julia Bone)                 female  34.0  0      1      231919      23.0000   NaN    S
539    540          1         1       Frolicher, Miss. Hedwig Margaritha                   female  22.0  0      2      13568       49.5000   B39    C

Duplicate rows

Dataset A

Dataset does not contain duplicate rows.

Dataset B

Dataset does not contain duplicate rows.
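
A duplicate-row check like the one reported here can be sketched as follows. This is only an illustration under stated assumptions, not the profiling tool's own code: the helper name `duplicate_rows` and the tuple-per-row input are hypothetical. It counts identical rows and returns those that occur more than once, so an empty result corresponds to "Dataset does not contain duplicate rows."

```python
from collections import Counter


def duplicate_rows(rows):
    """Return a mapping of each repeated row to its number of occurrences.

    `rows` is an iterable of hashable records (for example tuples).
    Rows that appear only once are left out of the result.
    """
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical toy rows, not taken from Dataset A or Dataset B.
toy = [(1, "male", 22.0), (2, "female", 38.0), (1, "male", 22.0)]
print(duplicate_rows(toy))
```

Because PassengerId and Name are unique in both datasets, no two full rows can match, which is consistent with the zero duplicate count in the overview.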